# 41 Dot product and orthogonality in Euclidean spaces.

In $\mathbb R^{n}$ we define the **dot product** between two vectors $x,y$ to be $$ x \cdot y = x_{1} y_{1} +x_{2}y_{2}+\cdots + x_{n} y_{n} = \sum_{i=1}^{n} x_{i} y_{i} $$where $x = \begin{pmatrix}x_{1}\\x_{2}\\\vdots\\x_{n}\end{pmatrix}$ and $y = \begin{pmatrix}y_{1}\\y_{2}\\\vdots\\y_{n}\end{pmatrix}$. We will sometimes write $\langle x, y\rangle$ to denote the dot product between $x$ and $y$ as well. Also, we will omit the vector arrow symbol $\vec{\ }$ for readability, so you will have to infer from context what is a scalar, what is a vector, and what is a matrix. However, when I want to emphasize that something is a vector, I will put the arrow symbol.

## Matrix formulation for dot product: $x^{T}y$.

A useful computational notation is to treat vectors $x,y \in \mathbb R^{n}$ as $n \times 1$ column vectors, so that $$ \begin{align*} x \cdot y &= x^{T} y \\ &= \begin{bmatrix}x_{1} & x_{2} & \cdots & x_{n}\end{bmatrix} \begin{bmatrix}y_{1}\\y_{2}\\\vdots\\y_{n}\end{bmatrix} \\ &= x_{1}y_{1}+ x_{2}y_{2} + \cdots +x_{n}y_{n} \\ &= \sum_{i=1}^{n} x_{i}y_{i}. \end{align*}$$Now the astute reader might realize that technically $x \cdot y$ is a scalar, while the matrix product $x^{T}y$ produces a $1 \times 1$ matrix. While technically true, it is useful later for computational purposes to think of it this way. So, whenever it is appropriate, we interpret $x^{T} y$ in a dot product context to mean the scalar entry of that $1\times1$ matrix.

## Main properties of dot product in $\mathbb R^{n}$.

Over the reals, the dot product satisfies three defining characteristics:

> **Main properties of the real dot product.**
> Let $x,y,z$ be vectors in $\mathbb R^{n}$ and $c$ a real scalar. Then we have
> (1) (Positive-definiteness) $x \cdot x \ge 0$. And $x \cdot x = 0$ if and only if $x = \vec 0$.
> (2) (Symmetry) $x \cdot y = y \cdot x$.
> (3) (Linearity in both components) $$ \begin{array}{} (c x) \cdot y = c(x \cdot y) \\ (x+y) \cdot z = x\cdot z + y \cdot z \\ x\cdot (cy) = c (x\cdot y) \\ x\cdot(y+z) = x\cdot y + x\cdot z \end{array} $$

These properties are not too hard to verify by using the definition of the dot product directly.

### Commentary on complex dot product.

Over the complex linear space $\mathbb C^{n}$, one could also define a complex dot product, but we have to modify the definition. If $x,y \in \mathbb C^{n}$, then $$ x \cdot y = \sum_{i=1}^{n} x_{i}\bar y_{i}= x_{1} \bar y_{1} +x_{2} \bar y_{2} + \cdots + x_{n} \bar y_{n} $$ where the bar denotes complex conjugation, $\overline{a+i b} = a-ib$. The reason for this modification is so that the complex dot product is still positive-definite. However, it loses symmetry and linearity in both components. Instead, we have

> **Main properties of the complex dot product.**
> Let $x,y,z$ be vectors in $\mathbb C^{n}$ and $c$ a complex scalar. Then we have
> (1) (Positive-definiteness) $x\cdot x\ge 0$. And $x \cdot x = 0$ if and only if $x = \vec 0$.
> (2) (Conjugate symmetry) $x\cdot y = \color{blue}\overline{y\cdot x}$.
> (3) (Linearity in the first component, conjugate linearity in the second component) $$ \begin{array}{} (c x) \cdot y = c(x \cdot y) \\ (x+y) \cdot z = x\cdot z + y \cdot z \\ x\cdot (cy) = \color {blue}\bar c (x\cdot y) \\ x\cdot(y+z) = x\cdot y + x\cdot z \end{array} $$

When a scalar is factored out from the second component of the dot product, it becomes complex conjugated. To this end we say the complex dot product is _sesquilinear_, or _one-and-a-half linear_.
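To make the matrix formulation and these properties concrete, here is a minimal numerical sketch in Python with numpy (the specific vectors and the helper `cdot` are my own illustrative choices, not anything from these notes); it checks that $x^{T}y$ recovers the real dot product, and that the complex dot product defined above is positive-definite, conjugate symmetric, and sesquilinear.

```python
import numpy as np

# Real case: the 1x1 matrix x^T y carries the same number as the dot product.
x = np.array([[1.0], [2.0], [3.0]])   # n x 1 column vectors
y = np.array([[4.0], [-1.0], [2.0]])
assert np.isclose((x.T @ y).item(), np.sum(x * y))

# Complex case: x . y = x_1 conj(y_1) + ... + x_n conj(y_n).
def cdot(u, v):
    """Complex dot product, with the conjugate on the second component."""
    return np.sum(u * np.conj(v))

u = np.array([1 + 2j, 3 - 1j])
v = np.array([2 - 1j, 0 + 1j])
c = 2 + 3j

# (1) positive-definiteness: u . u is real and nonnegative
assert np.isclose(cdot(u, u).imag, 0.0) and cdot(u, u).real >= 0
# (2) conjugate symmetry: u . v equals the conjugate of v . u
assert np.isclose(cdot(u, v), np.conj(cdot(v, u)))
# (3) sesquilinearity: a scalar leaves the second slot with a conjugate
assert np.isclose(cdot(c * u, v), c * cdot(u, v))
assert np.isclose(cdot(u, c * v), np.conj(c) * cdot(u, v))
```

(As an aside, numpy's built-in `np.vdot` conjugates its *first* argument, which matches the physics convention mentioned below; that is why the helper here conjugates the second slot by hand, to match the convention of these notes.)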
We would also need to modify the matrix formulation for the complex dot product: if $x,y \in \mathbb C^{n}$, we take $$ x\cdot y = y^{H}x $$where $y^{H}$ is called the _conjugate transpose_, or _Hermitian transpose_, of $y$: you transpose the vector and take the complex conjugate of each of the entries. That is, $$ y^{H} = \begin{bmatrix}y_{1}\\y_{2}\\\vdots\\y_{n}\end{bmatrix}^{H}=\begin{bmatrix}\bar y_{1} & \bar y_{2} & \cdots & \bar y_{n}\end{bmatrix} $$ The reason the order is switched is that it is the second component that gets conjugated, and we still want to produce a $1\times 1$ matrix. In physics, some redefine the complex dot product so that it is the first component that is conjugate linear, which would give a more obvious matrix formulation. Possibly one of the very few things I agree with physicists on.

Notice that the complex dot product definition is consistent with the real case, since taking the complex conjugate of a real number does nothing. So the complex definition is a generalization of the real case, and later, when one generalizes to abstract inner products, this is the definition one takes. In any case, we won't deal with the complex case too much; this is just to make you aware it exists.

## Geometric measure with dot product.

One may ask: ok, but what _is_ the dot product? One answer is that it is an algebraic gadget that measures geometric lengths, distances, angles, and orthogonality among vectors. This is a good intuition to have.

### Length.

> Given $x \in \mathbb R^{n}$, define its **length** (or **norm**, or **magnitude**) to be $$ \Vert x \Vert = \sqrt{x\cdot x} = \sqrt{\sum_{i = 1}^{n} x_{i}^{2}} $$where $x_{i}$ are the components of $x$. It is sometimes more useful to think about the square of the norm to avoid square roots, $$ \Vert x \Vert^{2} = x \cdot x. $$
> If $c$ is a scalar, then we have $$ \Vert c x\Vert = |c| \Vert x \Vert. $$

A vector $u$ is said to be a **unit vector** if its length is $1$, that is, $\Vert u\Vert = 1$. If $x \neq \vec 0$ is a nonzero vector, then we can perform a **normalization** of $x$ to produce a unit vector in the same direction as $x$ but with length 1, by dividing out the length of $x$, namely $$ \frac{x}{\Vert x\Vert}. $$ For instance the normalization of $x = \begin{pmatrix}1\\2\\3\end{pmatrix}$ is $$ \frac{x}{\Vert x\Vert} = \frac{1}{\sqrt{14}}\begin{pmatrix}1\\2\\3\end{pmatrix}. $$

### Distance.

> The distance between two vectors $x,y \in \mathbb R^{n}$ is given by $$ \text{dist}(x,y) = \Vert x-y\Vert. $$

One can see why this is a sensible definition by drawing a diagram. It is good to remember how it can be rephrased in terms of the dot product, $$ \Vert x-y\Vert = \sqrt{(x-y)\cdot(x-y)}. $$ Often the square of the distance is easier to work with, to avoid square roots.

### Angle.

We claim that

> For $x,y \in \mathbb R^n$, we have $$ x \cdot y = \Vert x \Vert \Vert y \Vert \cos\theta $$where $\theta$ is the angle between the vectors $x,y$.

![[smc-spring-2024-math-13/linear-algebra-notes/---files/dot-prod-angle.svg]]

To prove this, we recall the law of cosines: for a triangle with sides $a,b,c$, where $\theta$ is the angle opposing side $c$, we have $$ a^{2}+b^{2} = c^{2} +2ab \cos\theta $$![[smc-spring-2024-math-13/linear-algebra-notes/---files/law-of-cosine.svg]]

You want to remember this as a _generalization of the Pythagorean theorem_. So for vectors $x,y$, form a triangle with the third side being the vector $x-y$.
Then applying the law of cosines we have $$ \Vert x \Vert^{2} +\Vert y\Vert^{2} = \Vert x-y\Vert^{2} +2\Vert x\Vert \Vert y\Vert \cos\theta $$Now, by expanding out $\Vert x-y\Vert^{2}$, we see that $$ \begin{align*} \Vert x-y\Vert^{2} &=(x-y)\cdot(x-y) \\ & = x\cdot x - x\cdot y-y\cdot x+y\cdot y \\ & = \Vert x\Vert^{2} -2(x\cdot y) +\Vert y \Vert^{2} \end{align*} $$So substituting it back into the law of cosines expression, we have $$ \begin{array}{cl} & \Vert x \Vert^{2} +\Vert y\Vert^{2} = \Vert x-y\Vert^{2} +2\Vert x\Vert \Vert y\Vert \cos\theta \\ \implies & \Vert x \Vert^{2} +\Vert y\Vert^{2} = \Vert x\Vert^{2} -2(x\cdot y )+\Vert y \Vert^{2} +2\Vert x\Vert \Vert y\Vert \cos\theta \\ \implies & 2( x\cdot y) = 2\Vert x \Vert \Vert y\Vert \cos\theta \\ \implies & x\cdot y = \Vert x \Vert \Vert y \Vert \cos\theta. \end{array} $$ This shows the dot product is really measuring the (cosine of the) angle between the vectors, where $$ \cos\theta = \frac{x\cdot y}{\Vert x\Vert \Vert y \Vert} $$provided that $x,y$ are not zero vectors.

### Orthogonality.

When the angle $\theta$ between two vectors $x,y$ is $90^{\circ} = \frac{\pi}{2}$, then since $\cos(90^{\circ})=0$, we have $x\cdot y = 0$. This shows if $x$ and $y$ are perpendicular to each other, then $x \cdot y = 0$. But $x\cdot y = 0$ also happens when one of $x$ or $y$ is the zero vector $\vec 0$. So we make a more encompassing definition as follows

> We say $x,y \in \mathbb R^{n}$ are **orthogonal** to each other if $$ x \cdot y = 0. $$In this case, we will also write $x \perp y$.

## Cauchy-Schwarz inequality.

Observe if we have $x,y \in \mathbb R^{n}$, then $$ x \cdot y = \Vert x \Vert \Vert y \Vert \cos\theta, $$and since $|\cos \theta| \le 1$, we have the following result

> **Cauchy-Schwarz inequality.**
> For any two vectors $x,y \in \mathbb R^{n}$, we have $$ |x \cdot y| \le \Vert x\Vert \Vert y\Vert $$Furthermore, we have equality $$ |x\cdot y| = \Vert x \Vert \Vert y\Vert \iff x,y \text{ are linearly dependent} $$

The equality case is precisely when $|\cos\theta|=1$, which is when $\theta = 0$ or $\pi$, precisely when $x,y$ are scalar multiples of each other. This is perhaps one of the _most important inequalities in all of mathematics_, especially in analysis (calculus). By the way, Cauchy is pronounced _KOSHY_.

To demonstrate some of the power of Cauchy-Schwarz, consider the following examples.

**Example.** Suppose three real numbers $a,b,c$ are such that $a+b+c = 1$. Show we always have $$ a^{2}+b^{2}+c^{2} \ge \frac{1}{3}. $$Furthermore, show that the minimum $a^{2}+b^{2}+c^{2} = \frac{1}{3}$ is achieved only when $a=b=c=\frac{1}{3}$.

$\blacktriangleright$ To establish the inequality using Cauchy-Schwarz, we need to introduce vectors and dot products into the picture somehow. Note $$ a+b+c = \begin{pmatrix}1\\1\\1\end{pmatrix} \cdot \begin{pmatrix}a\\b\\c\end{pmatrix} $$So applying Cauchy-Schwarz, we have $$ |\begin{pmatrix}1\\1\\1\end{pmatrix}\cdot \begin{pmatrix}a\\b\\c \end{pmatrix} |\le \Vert \begin{pmatrix}1\\1\\1\end{pmatrix}\Vert \Vert \begin{pmatrix}a\\b\\c\end{pmatrix}\Vert =\sqrt{3} \sqrt{a^{2}+b^{2}+c^{2}} $$ Since $a+b+c =1$, we therefore have $$ 1 \le \sqrt{3} \sqrt{a^{2}+b^{2}+c^{2}} \implies a^{2}+b^{2}+c^{2} \ge \frac{1}{3} $$as claimed.
Furthermore, Cauchy-Schwarz tells us the equality case holds only when $$ \begin{pmatrix}1\\1\\1\end{pmatrix} \text{ and } \begin{pmatrix}a\\b\\c\end{pmatrix} \text{ are linearly dependent.} $$So we have $$ \begin{pmatrix}a\\b\\c\end{pmatrix} = \lambda \begin{pmatrix}1\\1\\1\end{pmatrix} $$for some scalar $\lambda$, so $a=b=c=\lambda$. Since $a+b+c = 1$, we have $\lambda+\lambda+\lambda=1$, whence $\lambda = \frac{1}{3}$. So the minimum is achieved precisely when $$ a=b=c= \frac{1}{3} $$as claimed! $\blacklozenge$

**Example.** Consider the function $f(x,y,z) = 3x- 2y + 5z$. Find $(x,y,z)$ such that $f$ is maximized over the sphere $x^{2}+y^{2}+z^{2} = 7$, and find the maximum value. You must show it is a maximum.

$\blacktriangleright$ This looks like a calculus problem with Lagrange multipliers. But we merely need to apply Cauchy-Schwarz in a clever way. Again, get vectors and dot products into the picture. Note that $$ f(x,y,z) = 3x - 2y + 5z = \begin{pmatrix}3\\-2\\5\end{pmatrix} \cdot \begin{pmatrix}x\\y\\z\end{pmatrix} $$So $$ |f(x,y,z)| \le \Vert \begin{pmatrix}3\\-2\\5\end{pmatrix}\Vert \Vert \begin{pmatrix}x\\y\\z\end{pmatrix}\Vert = \sqrt{38} \sqrt{x^{2}+y^{2}+z^{2}} = \sqrt{266} $$as we are given the constraint $x^{2}+y^{2}+z^{2}=7$. This shows $|f(x,y,z)|$ is bounded by $\sqrt{266}$, and in fact equality is achievable when $$ \begin{pmatrix}x\\y\\z\end{pmatrix} = \lambda \begin{pmatrix}3\\-2\\5\end{pmatrix} $$that is, when those two vectors are linearly dependent, again by Cauchy-Schwarz. So with $x=3\lambda, y=-2\lambda,z=5\lambda$, we solve for $\lambda$ by setting $$ \begin{array}{} f(3\lambda,-2\lambda,5\lambda) = \sqrt{266} \\ \implies 9\lambda+4\lambda+25\lambda = \sqrt{266} \\ \implies \lambda = \frac{\sqrt{266}}{38} \end{array} $$So, $f(x,y,z)= 3x- 2y + 5z$ constrained on the sphere $x^{2}+y^{2}+z^{2}=7$ has maximum value $\sqrt{266}$, attained at $(x,y,z) = (\frac{3\sqrt{266}}{38}, \frac{-2\sqrt{266}}{38},\frac{5\sqrt{266}}{38})$. $\blacklozenge$

## Pythagorean theorem and triangle inequality.

We also have some nice results arising from orthogonality and Cauchy-Schwarz.

> **Pythagorean theorem.**
> Let $x,y \in \mathbb R^{n}$. Then $x$ and $y$ are orthogonal if and only if $$ \Vert x\Vert^{2} + \Vert y \Vert^{2} = \Vert x + y\Vert^{2}. $$

$\blacktriangleright$ ($\implies$) Suppose $x$ and $y$ are orthogonal, then $x\cdot y = 0$. Now we expand $$ \Vert x+y\Vert^{2} = (x+y) \cdot (x+y) = \Vert x\Vert^{2}+{\color{blue}2(x\cdot y)} +\Vert y\Vert^{2} = \Vert x\Vert^{2} + \Vert y\Vert^{2}. $$

($\impliedby$) Suppose we have equality $$ \Vert x\Vert^{2} + \Vert y \Vert^{2} = \Vert x + y\Vert^{2}. $$If we expand out the right hand side again we have $$ \begin{array}{} \Vert x+y\Vert^{2} = (x+y) \cdot (x+y) = \Vert x\Vert^{2}+{\color{blue}2(x\cdot y)} +\Vert y\Vert^{2} \\ \implies \Vert x\Vert^{2} + \Vert y \Vert^{2} = \Vert x\Vert^{2}+{\color{blue}2(x\cdot y)} +\Vert y\Vert^{2} \\ \implies x\cdot y = 0. \quad \blacksquare \end{array} $$

> **Triangle inequality.**
> Let $x,y \in \mathbb R^{n}$.
> Then we always have $$ \Vert x+y\Vert \le \Vert x\Vert +\Vert y\Vert $$

$\blacktriangleright$ We first expand the expression $$ \Vert x+y\Vert^{2} = (x+y) \cdot (x+y) = \Vert x \Vert^{2} +{\color{blue}2 x\cdot y} + \Vert y \Vert^{2} $$And applying Cauchy-Schwarz to the dot product gives $$ \begin{align*} \Vert x + y\Vert^{2} & = \Vert x \Vert^{2} +2 x\cdot y + \Vert y \Vert^{2} \\ & \le \Vert x \Vert^{2} +2 \Vert x\Vert \Vert y \Vert + \Vert y \Vert^{2} \\ & =(\Vert x\Vert+\Vert y\Vert)^{2} \end{align*} $$where the last step is factoring. Hence, by taking square roots we have $$ \Vert x + y \Vert \le \Vert x\Vert + \Vert y \Vert $$as desired. $\blacksquare$

## Orthogonal complement of a subspace.

First we say $W$ is a **subspace** of $\mathbb R^{n}$ if it is a _linear space_ inside $\mathbb R^{n}$. Note in this case we can write $W = \operatorname{span}(w_{1},w_{2},\ldots, w_{\ell})$ for some vectors $w_{1},w_{2},\ldots,w_{\ell} \in \mathbb R^{n}$.

> We define the **orthogonal complement** of the subspace $W$ in $\mathbb R^{n}$ to be another set $W^{\perp}$ (the symbol $\perp$ is pronounced _"perp"_) where $$ W^{\perp} = \{x\in \mathbb R^{n} : \forall y\in W, x\cdot y=0\}. $$That is, $W^{\perp}$ is the set of all vectors in $\mathbb R^{n}$ that are orthogonal to every vector in $W$.

**Example.** Let $W = \operatorname{span}\{\begin{pmatrix}1\\0\\1\end{pmatrix},\begin{pmatrix}2\\1\\1\end{pmatrix}\}$, which is a plane in $\mathbb R^{3}$. Then $W^{\perp}$ is the set of vectors on the line perpendicular to $W$. A diagram is as follows:

![[smc-spring-2024-math-13/linear-algebra-notes/---files/orth-comp.svg]]

In particular, vectors in $W^{\perp}$ need to be perpendicular to each of the basis vectors of $W$. So if $\begin{pmatrix}x\\y\\z\end{pmatrix} \in W^{\perp}$, then we need $$ \begin{align*} \begin{pmatrix}1\\0\\1\end{pmatrix} \cdot \begin{pmatrix}x\\y\\z\end{pmatrix} = 0 \\ \begin{pmatrix}2\\1\\1\end{pmatrix} \cdot \begin{pmatrix}x\\y\\z\end{pmatrix} = 0 \end{align*} $$ This gives a system of linear equations $$ \begin{align*} x + z = 0 \\ 2x + y + z = 0 \end{align*} $$which we can solve by reducing the augmented matrix $$ \left[\begin{array}{ccc|c} 1 & 0 & 1 & 0 \\ 2 & 1 & 1 & 0 \end{array}\right] \stackrel{\text{row}}\sim \left[\begin{array}{ccc|c} 1 & 0 & 1 & 0 \\ 0 & 1 & -1 & 0 \end{array}\right] $$which shows $W^{\perp} = \operatorname{span}\{\begin{pmatrix}-1\\1\\1\end{pmatrix}\}$. $\blacklozenge$

**Remark.** This calculation of $W^{\perp}$ shows that it is the nullspace of some matrix, namely the matrix whose rows are the basis vectors of $W$! We will record this computational result below. But first, some properties.

### Properties of orthogonal complement.

We have the following properties of the orthogonal complement $W^{\perp}$:

> **Properties of orthogonal complement.**
> If $W$ is a subspace of $\mathbb R^{n}$, then
> (1) $W^{\perp}$ is also a subspace of $\mathbb R^{n}$.
> (2) $\dim W + \dim W^{\perp} = n$.
> (3) $(W^{\perp})^{\perp} = W$.

**Remark**. You might make the observation that (2) looks like the rank-nullity theorem. And indeed that is the case: there is a linear map whose range is $W$ and whose kernel is $W^{\perp}$, namely the _orthogonal projection map onto $W$_. We will discuss this later. Also, you might remark that property (3) looks obvious. As it turns out, (3) is not always true in infinite-dimensional settings.

### How to compute $W^{\perp}$.
Since $W^{\perp}$ is also a linear space, we should be able to express it as a span of some vectors, or find a basis for it. Let us see how to compute it, given a subspace $W$. Suppose $W = \operatorname{span}(w_{1},w_{2},\ldots, w_{\ell})$; then any vector $x \in W^{\perp}$ must be orthogonal to each of $w_{1},w_{2},\ldots,w_{\ell}$. This gives the following system of linear equations $$ \begin{align*} w_{1} \cdot x = 0 &\implies w_{1}^{T} x = 0 \\ w_{2} \cdot x = 0 &\implies w_{2}^{T} x = 0 \\ &\vdots \\ w_{\ell} \cdot x = 0 &\implies w_{\ell}^{T} x = 0 \end{align*} $$ But if we think of $w_{i}^{T} x = 0$ as a matrix expression, and put them all in rows, we have $$ \begin{align*} \begin{bmatrix} w_{1} ^{T}x \\ w_{2} ^{T}x \\ \vdots \\ w_{\ell} ^{T}x \\ \end{bmatrix} = \begin{bmatrix}0\\0\\\vdots\\0\end{bmatrix} \\ \implies \begin{bmatrix} - & w_{1}^{T} & - \\ - & w_{2}^{T} & - \\ & \vdots \\ - & w_{\ell}^{T} & -\end{bmatrix} x = \vec 0 \end{align*} $$ This shows the vector $x \in W^{\perp}$ actually lies in the nullspace of some matrix! So we have the following computational result

> **Computation for $W^{\perp}$.**
> Suppose $W = \operatorname{span}(w_{1},w_{2},\ldots, w_{\ell})$, and consider the matrix $$ A = \begin{bmatrix} | & | & & |\\w_{1} & w_{2} & \cdots & w_{\ell} \\ | & | & & |\end{bmatrix} $$where we put the $w_{i}$ as columns. Then $$ W^{\perp} = \text{NS} (A^{T}). $$Also, observe that $W =\text{CS}(A)$, the columnspace of $A$.

By the way, recall that the nullspace of $A$ is just the kernel of the linear map $x\mapsto Ax$, while the columnspace of $A$ is the span of the columns of $A$, or the range of the linear map $x\mapsto Ax$.

**Example.** Let $W$ be a subspace in $\mathbb R^{3}$ such that $$ W = \operatorname{span} \{\begin{pmatrix}1\\2\\3\end{pmatrix},\begin{pmatrix}1\\1\\1\end{pmatrix}\} $$Find $W^{\perp}$ by writing it as a span of some vectors, and find a basis for $W^{\perp}$. What are the dimensions of $W$ and $W^{\perp}$?

$\blacktriangleright$ First we construct the matrix $$ A = \begin{pmatrix}1 & 1 \\ 2 & 1 \\ 3 & 1\end{pmatrix} $$Then $W =\text{CS}(A)$ and $W^{\perp} =\text{NS}(A^{T})$. By row reduction, we see that $$ \begin{align*} W^{\perp} = NS(A^{T}) &= NS\begin{pmatrix}1 & 2 & 3\\1 & 1 & 1\end{pmatrix} \\ &= NS \begin{pmatrix}1 & 0 & -1\\0 & 1 & 2\end{pmatrix} \\ & = \operatorname{span}\{ \begin{pmatrix}1\\-2\\1\end{pmatrix} \} . \end{align*} $$ So a basis for $W^{\perp}$ is $$ \{ \begin{pmatrix}1\\-2\\1\end{pmatrix} \} $$Also $\dim W= 2$ and $\dim W^{\perp} = 1$. $\blacklozenge$

In the above example, one can check that indeed the basis vector $\begin{pmatrix}1\\-2\\1\end{pmatrix}$ of $W^{\perp}$ is orthogonal to both the (basis) vectors $\begin{pmatrix}1\\2\\3\end{pmatrix}$ and $\begin{pmatrix}1\\1\\1\end{pmatrix}$ of $W$ by taking dot products!

## Fundamental theorem of linear algebra.

If you think carefully about the example above, where $W =\text{CS}(A)$ and $W^{\perp}=\text{NS}(A^{T})$, we are led to the following simple but central result:

> **Fundamental theorem of linear algebra.**
> Let $A$ be any $n\times k$ real matrix. Then we always have $$ \text{CS}(A)^{\perp} =\text{NS}(A^{T}) \text{ and } \text{CS}(A^{T})^{\perp} = \text{NS}(A). $$

By taking orthogonal complements (perp) and transposes, one can also write down the other two equalities, $$ \text{CS}(A) = \text{NS}(A^{T})^{\perp} \text{ and } \text{CS}(A^{T}) = \text{NS}(A)^{\perp}. $$
In short, this is saying that among the four **fundamental subspaces** $\text{CS}(A),\text{NS}(A),\text{CS}(A^{T}),\text{NS}(A^{T})$ that we can write down for a matrix $A$, we have $$ \begin{align*} \text{CS}(A) \text{ and } \text{NS}(A^{T}) \text{ are orthogonal complements of each other,} \\ \text{CS}(A^{T}) \text{ and } \text{NS}(A) \text{ are orthogonal complements of each other.} \end{align*} $$Isn't this a beautiful yet simple result! Schematically, it looks something like this

![[smc-spring-2024-math-13/linear-algebra-notes/---files/ftla.svg]]

Observe that if $A$ is of size $n \times k$, then $A^{T}$ is $k\times n$, and that:

- As a linear map, $x \mapsto Ax$ has domain $\mathbb R^{k}$ and codomain $\mathbb R^{n}$
- As a linear map, $x \mapsto A^{T} x$ has domain $\mathbb R^{n}$ and codomain $\mathbb R^{k}$
- $CS(A)$ is a subspace of $\mathbb R^{n}$
- $NS(A)$ is a subspace of $\mathbb R^{k}$
- $CS(A^{T})$ is a subspace of $\mathbb R^{k}$
- $NS(A^{T})$ is a subspace of $\mathbb R^{n}$

The term _fundamental theorem of linear algebra_ was coined by Gilbert Strang (I believe), as it records a fundamental relation between the four _fundamental subspaces_ associated with a matrix $A$. The term _fundamental subspaces_ was probably also coined by Strang.

**Example.** Let us illustrate the fundamental theorem of linear algebra. Consider the matrix $$ A = \begin{pmatrix}1 & 1\\2 & 2\end{pmatrix}. $$If we calculate the four fundamental subspaces, we have $$ \begin{align*} CS(A) &= CS\begin{pmatrix}1 & 1\\2 & 2\end{pmatrix}=\operatorname{span} \begin{pmatrix}1\\2\end{pmatrix} \\ NS(A) &= NS \begin{pmatrix}1 & 1\\2 & 2\end{pmatrix} =\operatorname{span} \begin{pmatrix}-1\\1\end{pmatrix} \\ CS(A^{T}) &= CS \begin{pmatrix}1 & 2\\1 & 2\end{pmatrix} = \operatorname{span} \begin{pmatrix}1\\1\end{pmatrix} \\ NS(A^{T}) &= NS\begin{pmatrix}1 & 2\\1 & 2\end{pmatrix} =\operatorname{span} \begin{pmatrix}-2\\1\end{pmatrix}. \end{align*} $$ Observe we have orthogonality $CS(A) \perp NS(A^{T})$ and $CS(A^{T}) \perp NS(A)$! A sketch of these subspaces is as follows

![[smc-spring-2024-math-13/linear-algebra-notes/---files/ftla-ex.svg]]
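As a closing numerical check, here is a small Python/numpy sketch (the `nullspace` helper built from the SVD is my own illustrative utility, not anything defined in these notes). It recomputes $W^{\perp} = \text{NS}(A^{T})$ for the earlier example $W = \operatorname{span}\{\begin{pmatrix}1\\2\\3\end{pmatrix},\begin{pmatrix}1\\1\\1\end{pmatrix}\}$, and verifies the orthogonality in the fundamental theorem of linear algebra for the matrix $\begin{pmatrix}1&1\\2&2\end{pmatrix}$ from the last example.

```python
import numpy as np

def nullspace(A, tol=1e-10):
    """Orthonormal basis for NS(A) via the SVD: the right singular vectors
    whose singular values are (numerically) zero span the nullspace."""
    _, s, Vt = np.linalg.svd(A)
    rank = int(np.sum(s > tol))
    return Vt[rank:].T              # columns form a basis of NS(A)

# --- W^perp = NS(A^T) for W = span{(1,2,3), (1,1,1)} ---
A = np.array([[1.0, 1.0],
              [2.0, 1.0],
              [3.0, 1.0]])          # basis vectors of W as columns, so W = CS(A)
W_perp = nullspace(A.T)             # a single column, parallel to (1, -2, 1)
assert np.allclose(A.T @ W_perp, 0) # orthogonal to every basis vector of W
assert W_perp.shape[1] == 1         # dim W + dim W^perp = 2 + 1 = 3

# --- Fundamental theorem of linear algebra for B = [[1,1],[2,2]] ---
B = np.array([[1.0, 1.0],
              [2.0, 2.0]])
cs_B  = B[:, [0]]                   # CS(B)   = span (1, 2)
cs_Bt = B.T[:, [0]]                 # CS(B^T) = span (1, 1)
ns_B, ns_Bt = nullspace(B), nullspace(B.T)
assert np.allclose(cs_B.T @ ns_Bt, 0)   # CS(B)   is orthogonal to NS(B^T)
assert np.allclose(cs_Bt.T @ ns_B, 0)   # CS(B^T) is orthogonal to NS(B)
```

In exact arithmetic one would just row reduce as in the examples above; the SVD here is simply a convenient, numerically stable way to extract a nullspace basis in floating point.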